Analyzing Inexact Hypergradients for Bilevel Learning
Estimating hyperparameters has been a long-standing problem in machine
learning. We consider the case where the task at hand is modeled as the
solution to an optimization problem. Here the exact gradient with respect to
the hyperparameters cannot be feasibly computed and approximate strategies are
required. We introduce a unified framework for computing hypergradients that
generalizes existing methods based on the implicit function theorem and
automatic differentiation/backpropagation, showing that these two seemingly
disparate approaches are actually tightly connected. Our framework is extremely
flexible, allowing its subproblems to be solved with any suitable method, to
any degree of accuracy. We derive a priori and computable a posteriori error
bounds for all our methods, and numerically show that our a posteriori bounds
are usually more accurate. Our numerical results also show that, surprisingly,
for efficient bilevel optimization, the choice of hypergradient algorithm is at
least as important as the choice of lower-level solver.
Comment: Accepted to IMA Journal of Applied Mathematics
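As a minimal sketch of the implicit-function-theorem route to inexact hypergradients (not the paper's unified framework itself), the toy Python example below differentiates through a ridge-regularized lower-level problem: the lower level is solved inexactly by gradient descent, and the IFT linear system is solved inexactly by truncated conjugate gradients. The problem data, step sizes, and iteration counts are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
x_target = rng.standard_normal(10)  # hypothetical target for the upper-level loss


def solve_lower(theta, n_iters=200):
    """Inexactly solve x*(theta) = argmin_x ||Ax - b||^2 + theta ||x||^2
    by gradient descent (any suitable lower-level solver could be used)."""
    n = A.shape[1]
    H = 2.0 * (A.T @ A + theta * np.eye(n))   # lower-level Hessian
    lin = 2.0 * A.T @ b
    step = 1.0 / np.linalg.eigvalsh(H).max()  # step from the Lipschitz constant
    x = np.zeros(n)
    for _ in range(n_iters):
        x -= step * (H @ x - lin)
    return x, H


def conjugate_gradient(H, rhs, n_iters=20):
    """Inexactly solve H q = rhs; truncating CG yields an inexact hypergradient."""
    q = np.zeros_like(rhs)
    r = rhs - H @ q
    p = r.copy()
    rs = r @ r
    for _ in range(n_iters):
        Hp = H @ p
        alpha = rs / (p @ Hp)
        q += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) < 1e-10:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return q


def inexact_hypergradient(theta):
    """IFT hypergradient: d f / d theta = -(d^2 g / dx dtheta)^T H^{-1} grad_x F(x*)."""
    x, H = solve_lower(theta)
    grad_F = x - x_target        # gradient of the upper loss 0.5 * ||x - x_target||^2
    q = conjugate_gradient(H, grad_F)
    cross = 2.0 * x              # d^2 g / dx dtheta for the ridge term theta * ||x||^2
    return -(cross @ q)


print("hypergradient at theta=0.5:", inexact_hypergradient(0.5))
```

Shortening either inner loop trades hypergradient accuracy for cost, which is exactly the trade-off the a priori and a posteriori bounds quantify.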
On Optimal Regularization Parameters via Bilevel Learning
Variational regularization is commonly used to solve linear inverse problems,
and involves augmenting a data fidelity term with a regularizer. The regularizer is
used to promote a priori information, and is weighted by a regularization
parameter. Selection of an appropriate regularization parameter is critical,
with various choices leading to very different reconstructions. Existing
strategies such as the discrepancy principle and L-curve can be used to
determine a suitable parameter value, but in recent years a supervised machine
learning approach called bilevel learning has been employed. Bilevel learning
is a powerful framework to determine optimal parameters, and involves solving a
nested optimisation problem. While previous strategies enjoy various
theoretical results, the well-posedness of bilevel learning in this setting is
still a developing field. One necessary property is positivity of the
determined regularization parameter. In this work, we provide a new condition
that better characterises positivity of optimal regularization parameters than
the existing theory. Numerical results verify and explore this new condition
for both small- and large-dimensional problems.
Comment: 26 pages, 6 figures
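The toy Python sketch below illustrates the bilevel setting (not the paper's positivity condition itself): it scans the regularization parameter of a Tikhonov-regularized reconstruction against a supervised upper-level loss and checks whether the minimizer is strictly positive. The forward operator, noise level, and parameter grid are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 30, 30
A = rng.standard_normal((m, n)) / np.sqrt(m)    # hypothetical forward operator
x_true = rng.standard_normal(n)                 # hypothetical ground-truth signal
y = A @ x_true + 0.05 * rng.standard_normal(m)  # noisy measurements


def reconstruct(alpha):
    """Tikhonov solution x_alpha = argmin_x ||Ax - y||^2 + alpha ||x||^2."""
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)


def upper_loss(alpha):
    """Supervised upper-level loss against the known ground truth."""
    return 0.5 * np.linalg.norm(reconstruct(alpha) - x_true) ** 2


alphas = np.logspace(-6, 2, 200)
losses = [upper_loss(a) for a in alphas]
best = alphas[int(np.argmin(losses))]
print(f"loss at alpha = 0: {upper_loss(0.0):.4f}")
print(f"best alpha > 0:    {best:.4g}  (loss {min(losses):.4f})")
# If the best positive alpha beats alpha = 0, the learned regularization
# parameter is strictly positive for this problem instance.
```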
A temporal multiscale approach for MR Fingerprinting
Quantitative MRI (qMRI) is becoming increasingly important for research and
clinical applications; however, state-of-the-art reconstruction methods for
qMRI are computationally prohibitive. We propose a temporal multiscale approach
to reduce computation times in qMRI. Instead of computing exact gradients of
the qMRI likelihood, we propose a novel approximation relying on the temporal
smoothness of the data. These gradients are then used in a coarse-to-fine (C2F)
approach, for example using coordinate descent. The C2F approach was also found
to improve the accuracy of solutions, compared to similar methods where no
multiscaling was used.
Comment: 4 pages, 3 figures. Title revised
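A minimal sketch of the coarse-to-fine idea, on a toy temporal model rather than the qMRI likelihood: early coordinate-descent sweeps use a heavily subsampled time grid, which temporal smoothness makes a cheap but accurate proxy for the full data, and later sweeps refine on the full grid. The signal model, parameters, and strides below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
t_full = np.linspace(0.0, 1.0, 256)  # dense time grid
basis = lambda t: np.stack([np.exp(-2.0 * t), np.exp(-8.0 * t)])  # temporal model
p_true = np.array([1.5, 0.8])        # hypothetical tissue parameters
data = p_true @ basis(t_full) + 0.01 * rng.standard_normal(t_full.size)


def coordinate_descent(p, t, d, n_sweeps=20):
    """Exact coordinate minimization for the least-squares fit p @ basis(t) ~ d."""
    Phi = basis(t)
    p = p.copy()
    for _ in range(n_sweeps):
        for i in range(p.size):
            # residual with coordinate i removed, then its closed-form update
            residual_wo_i = d - p @ Phi + p[i] * Phi[i]
            p[i] = (Phi[i] @ residual_wo_i) / (Phi[i] @ Phi[i])
    return p


# Coarse-to-fine: sweep on increasingly fine temporal grids.
p = np.zeros(2)
for stride in (32, 8, 1):
    p = coordinate_descent(p, t_full[::stride], data[::stride])
print("estimated:", p, " true:", p_true)
```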
On the convergence and sampling of randomized primal-dual algorithms and their application to parallel MRI reconstruction
Stochastic Primal-Dual Hybrid Gradient (SPDHG) is an algorithm to efficiently
solve a wide class of nonsmooth large-scale optimization problems. In this
paper we contribute to its theoretical foundations and prove its almost sure
convergence for functionals that are convex but not necessarily strongly
convex or smooth. We also prove its convergence for arbitrary sampling. In addition, we
study SPDHG for parallel Magnetic Resonance Imaging reconstruction, where data
from different coils are randomly selected at each iteration. We apply SPDHG
using a wide range of random sampling methods and compare its performance
across a range of settings, including mini-batch size and step size parameters.
We show that the sampling can significantly affect the convergence speed of
SPDHG, and that in many cases an optimal sampling can be identified.
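A minimal sketch of SPDHG with serial uniform sampling over coils, assuming random matrices as stand-ins for the per-coil forward operators (in parallel MRI these would be subsampled Fourier transforms composed with coil sensitivities); the sparsity penalty, step sizes, and iteration count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, n_coils = 64, 32, 4
# Hypothetical per-coil forward operators and data.
A = [rng.standard_normal((m, n)) / np.sqrt(m) for _ in range(n_coils)]
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = rng.standard_normal(8)
b = [Ac @ x_true + 0.01 * rng.standard_normal(m) for Ac in A]
lam = 0.01  # weight of the l1 penalty g(x) = lam * ||x||_1

prob = [1.0 / n_coils] * n_coils             # uniform serial sampling
norms = [np.linalg.norm(Ac, 2) for Ac in A]
sigma = [0.9 / s for s in norms]             # per-coil dual step sizes
tau = 0.9 / (n_coils * max(norms))           # primal step; satisfies
                                             # tau * sigma_c * ||A_c||^2 < prob_c

x = np.zeros(n)
y = [np.zeros(m) for _ in range(n_coils)]
z = np.zeros(n)      # z = sum_c A_c^T y_c
zbar = z.copy()

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

for k in range(2000):
    # primal prox step on g
    x = soft(x - tau * zbar, tau * lam)
    # sample one coil; prox of sigma * f_c^* for f_c(u) = 0.5 * ||u - b_c||^2
    c = rng.integers(n_coils)
    y_new = (y[c] + sigma[c] * (A[c] @ x - b[c])) / (1.0 + sigma[c])
    delta = A[c].T @ (y_new - y[c])
    y[c] = y_new
    z += delta
    zbar = z + delta / prob[c]   # extrapolation weighted by sampling probability

print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

Note that the primal step size depends on the sampling probabilities, so changing the sampling scheme changes the admissible step sizes, one way the sampling influences convergence speed.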